AITopics | long-term credit assignment

Contrastive Retrospection: honing in on critical steps for rapid learning and generalization in RL

Chen Sun

Neural Information Processing Systems | Feb-12-2026, 18:26:03 GMT

In real life, success is often contingent upon multiple critical steps that are distant in time from each other and from the final reward.

artificial intelligence, machine learning, reinforcement learning, (16 more...)

Neural Information Processing Systems

Country:

North America > Canada > Quebec > Montreal (0.14)
North America > United States > California > San Francisco County > San Francisco (0.14)
North America > Canada > Ontario > Toronto (0.14)

Genre: Research Report (0.93)

Industry:

Health & Medicine > Therapeutic Area > Neurology (0.68)
Leisure & Entertainment > Games > Computer Games (0.47)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Would I have gotten that reward? Long-term credit assignment by counterfactual contribution analysis

Neural Information Processing Systems | Dec-26-2025, 22:19:03 GMT

To make reinforcement learning more sample efficient, we need better credit assignment methods that measure an action's influence on future rewards. Building upon Hindsight Credit Assignment (HCA), we introduce Counterfactual Contribution Analysis (COCOA), a new family of model-based credit assignment algorithms. Our algorithms achieve precise credit assignment by measuring the contribution of actions upon obtaining subsequent rewards, by quantifying a counterfactual query: 'Would the agent still have reached this reward if it had taken another action?'. We show that measuring contributions w.r.t.

assignment, credit assignment, long-term credit assignment, (7 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.42)

6357d6d068622c962391081d296bed69-Paper-Conference.pdf

Neural Information Processing Systems | Oct-8-2025, 19:22:15 GMT

artificial intelligence, machine learning, reinforcement learning, (16 more...)

Neural Information Processing Systems

Country:

North America > Canada > Quebec > Montreal (0.14)
North America > United States > California > San Francisco County > San Francisco (0.14)
North America > Canada > Ontario > Toronto (0.14)
North America > United States > New York (0.04)

Genre: Research Report (0.93)

Industry:

Health & Medicine > Therapeutic Area > Neurology (0.68)
Leisure & Entertainment > Games > Computer Games (0.47)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Would I have gotten that reward? Long-term credit assignment by counterfactual contribution analysis

Neural Information Processing Systems | Jan-20-2025, 00:02:30 GMT

To make reinforcement learning more sample efficient, we need better credit assignment methods that measure an action's influence on future rewards. Building upon Hindsight Credit Assignment (HCA), we introduce Counterfactual Contribution Analysis (COCOA), a new family of model-based credit assignment algorithms. Our algorithms achieve precise credit assignment by measuring the contribution of actions upon obtaining subsequent rewards, by quantifying a counterfactual query: 'Would the agent still have reached this reward if it had taken another action?'. We show that measuring contributions w.r.t. We run experiments on a suite of problems specifically designed to evaluate long-term credit assignment capabilities.
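
As a rough illustration of the counterfactual query in the abstract above, the sketch below compares the probability of reaching a reward after each action with the probability of reaching it under the policy's average behaviour. The ratio-minus-one form, the function name `contribution_coefficients`, and the toy probabilities are assumptions made for illustration; the abstract does not spell out COCOA's exact definition, and the learned model that would supply these probabilities is not shown.

```python
def contribution_coefficients(policy, p_reward_given_action):
    """Score each action by how much it changes the chance of reaching a reward.

    policy: dict mapping action -> probability of taking it in the current state.
    p_reward_given_action: dict mapping action -> probability of later reaching
        the reward if that action is taken (assumed to come from some model).
    Returns a dict mapping action -> coefficient; positive means the action
    makes the reward more likely than the policy's average behaviour.
    """
    # Probability of reaching the reward when actions are drawn from the policy.
    p_reward = sum(policy[a] * p_reward_given_action[a] for a in policy)
    if p_reward == 0:
        raise ValueError("reward is unreachable from this state under the policy")
    # Hypothetical ratio-style coefficient: relative change caused by the action.
    return {a: p_reward_given_action[a] / p_reward - 1.0 for a in policy}


# Toy example with made-up numbers: "left" usually leads to the reward.
toy_policy = {"left": 0.5, "right": 0.5}
toy_model = {"left": 0.9, "right": 0.1}
coefficients = contribution_coefficients(toy_policy, toy_model)
```

In this toy case the coefficient is about 0.8 for "left" and about -0.8 for "right". By construction the coefficients average to zero under the policy, so an action is credited only to the extent that it changes the chance of reaching the reward.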

assignment, counterfactual contribution analysis, credit assignment, (4 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.32)

Contrastive Retrospection: honing in on critical steps for rapid learning and generalization in RL

Sun, Chen, Yang, Wannan, Jiralerspong, Thomas, Malenfant, Dane, Alsbury-Nealy, Benjamin, Bengio, Yoshua, Richards, Blake

arXiv.org Artificial Intelligence | Oct-27-2023

In real life, success is often contingent upon multiple critical steps that are distant in time from each other and from the final reward. These critical steps are challenging to identify with traditional reinforcement learning (RL) methods that rely on the Bellman equation for credit assignment. Here, we present a new RL algorithm that uses offline contrastive learning to hone in on these critical steps. This algorithm, which we call Contrastive Retrospection (ConSpec), can be added to any existing RL algorithm. ConSpec learns a set of prototypes for the critical steps in a task by a novel contrastive loss and delivers an intrinsic reward when the current state matches one of the prototypes. The prototypes in ConSpec provide two key benefits for credit assignment: (i) They enable rapid identification of all the critical steps. (ii) They do so in a readily interpretable manner, enabling out-of-distribution generalization when sensory features are altered. Distinct from other contemporary RL approaches to credit assignment, ConSpec takes advantage of the fact that it is easier to retrospectively identify the small set of steps that success is contingent upon (and ignoring other states) than it is to prospectively predict reward at every taken step. ConSpec greatly improves learning in a diverse set of RL tasks. The code is available at the link: https://github.com/sunchipsster1/ConSpec
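
The intrinsic-reward step described in the abstract above can be sketched as follows. This is a minimal illustration, not the authors' implementation (see the linked repository for that): it assumes state embeddings and prototypes are plain vectors, that "matching" means cosine similarity above a threshold, and that the threshold of 0.6 and the `scale` factor are hypothetical choices. Learning the prototypes with the contrastive loss is not shown.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two non-zero vectors given as lists of floats."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def intrinsic_reward(state_embedding, prototypes, threshold=0.6, scale=1.0):
    """Return a bonus when the state resembles any learned prototype.

    Each prototype whose similarity to the state exceeds `threshold`
    (an assumed hyperparameter) contributes its similarity to the bonus;
    states that match no prototype receive zero intrinsic reward.
    """
    similarities = [cosine_similarity(state_embedding, p) for p in prototypes]
    return scale * sum(s for s in similarities if s > threshold)


# Toy example: two orthogonal prototypes standing in for two critical steps.
toy_prototypes = [[1.0, 0.0], [0.0, 1.0]]
bonus_on_match = intrinsic_reward([1.0, 0.0], toy_prototypes)
bonus_on_miss = intrinsic_reward([-1.0, 0.0], toy_prototypes)
```

In a full agent this bonus would be added to the environment reward before the underlying RL algorithm's update, which is how such a module can sit on top of an existing method.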

conspec, critical step, prototype, (15 more...)

arXiv.org Artificial Intelligence

2210.05845

Country:

North America > Canada > Quebec > Montreal (0.14)
North America > United States > California > San Francisco County > San Francisco (0.14)
North America > Canada > Ontario > Toronto (0.14)
North America > United States > New York (0.04)

Genre: Research Report (1.00)

Industry:

Health & Medicine > Therapeutic Area > Neurology (0.68)
Leisure & Entertainment > Games > Computer Games (0.47)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Google DeepMind gamifies memory with its latest AI work

ZDNet

#artificialintelligence | Dec-4-2019, 15:32:19 GMT

DeepMind uses simulated environments to test how a "reinforcement learning" agent is able to complete tasks to receive rewards. You know when you've done something wrong, like putting a glass too close to the edge of the table, only to accidentally knock it off the table a moment later. Over time, you realize the mistake even before disaster strikes. Likewise, you know over the years when you have made the wrong choice, like choosing to become a manager at Best Buy rather than a pro-ball player, the latter of which would have made you so much more fulfilled. That second problem, how a sense of consequence develops over long stretches, is the subject of recent work by Google's DeepMind unit.

consequence, google deepmind gamify memory, latest ai work zdnet, (10 more...)

#artificialintelligence

Industry: Leisure & Entertainment > Games (0.30)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Beyond DQN/A3C: A Survey in Advanced Reinforcement Learning

#artificialintelligence | Nov-12-2019, 23:06:32 GMT

One of my favorite things about deep reinforcement learning is that, unlike supervised learning, it really, really doesn't want to work. Throwing a neural net at a computer vision problem might get you 80% of the way there. Throwing a neural net at an RL problem will probably blow something up in front of your face -- and it will blow up in a different way each time you try. A lot of the biggest challenges in RL revolve around two questions: how we interact with the environment effectively (e.g. In this post, I want to explore a few recent directions in deep RL research that attempt to address these challenges, and do so with particularly elegant parallels to human cognition. This post will begin with a quick review of two canonical deep RL algorithms -- DQN and A3C -- to provide us some intuitions to refer back to, and then jump into a deep dive on a few recent papers and breakthroughs in the categories described above.

artificial intelligence, machine learning, reinforcement learning, (16 more...)

#artificialintelligence

Country: Europe > United Kingdom > England > Greater London > London > Camden (0.04)

Industry: Health & Medicine > Therapeutic Area (0.48)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.97)

Beyond DQN/A3C: A Survey in Advanced Reinforcement Learning

#artificialintelligence | Jun-11-2019, 23:17:22 GMT

One of my favorite things about deep reinforcement learning is that, unlike supervised learning, it really, really doesn't want to work. Throwing a neural net at a computer vision problem might get you 80% of the way there. Throwing a neural net at an RL problem will probably blow something up in front of your face -- and it will blow up in a different way each time you try. A lot of the biggest challenges in RL revolve around two questions: how we interact with the environment effectively (e.g. In this post, I want to explore a few recent directions in deep RL research that attempt to address these challenges, and do so with particularly elegant parallels to human cognition. This post will begin with a quick review of two canonical deep RL algorithms -- DQN and A3C -- to provide us some intuitions to refer back to, and then jump into a deep dive on a few recent papers and breakthroughs in the categories described above.

artificial intelligence, machine learning, reinforcement learning, (16 more...)

#artificialintelligence

Country:

Europe > United Kingdom > England > Greater London > London > Camden (0.04)
Africa > Cameroon > Gulf of Guinea (0.04)

Industry: Health & Medicine > Therapeutic Area (0.48)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.97)
